IMPOSITION: Implicit Backdoor Attack through Scenario Injection
This paper presents a novel backdoor attack called IMPlicit BackdOor Attack
through Scenario InjecTION (IMPOSITION) that does not require direct poisoning
of the training data. Instead, the attack leverages a realistic scenario from
the training data as a trigger to manipulate the model's output during
inference. This type of attack is particularly dangerous as it is stealthy and
difficult to detect. The paper focuses on the application of this attack in the
context of Autonomous Driving (AD) systems, specifically targeting the
trajectory prediction module. To implement the attack, we design a trigger
mechanism that mimics a set of cloned behaviors in the driving scene, resulting
in a scenario that triggers the attack. The experimental results demonstrate
that IMPOSITION is effective in attacking trajectory prediction models while
maintaining high performance in untargeted scenarios. Our proposed method
highlights the growing importance of research on the trustworthiness of Deep
Neural Network (DNN) models, particularly in safety-critical applications.
Backdoor attacks pose a significant threat to the safety and reliability of DNN
models, and this paper presents a new perspective on backdooring DNNs. The
proposed IMPOSITION paradigm and the demonstration of its severity in the
context of AD systems are significant contributions of this paper. We highlight
the impact of the proposed attacks via empirical studies showing how IMPOSITION
can easily compromise the safety of AD systems.
CRITERIA: a New Benchmarking Paradigm for Evaluating Trajectory Prediction Models for Autonomous Driving
Benchmarking is a common method for evaluating trajectory prediction models
for autonomous driving. Existing benchmarks rely on datasets, which are biased
towards more common scenarios, such as cruising, and distance-based metrics
that are computed by averaging over all scenarios. Following such a regimen
provides little insight into the properties of the models both in terms of
how well they can handle different scenarios and how admissible and diverse
their outputs are. A number of complementary metrics have been designed to
measure the admissibility and diversity of trajectories; however, they suffer
from biases, such as the length of trajectories.
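
To illustrate the averaging issue described above, here is a minimal sketch, not taken from the paper. It computes two common distance-based metrics, average and final displacement error (ADE and FDE), on invented toy data and compares the mean over all samples with per-scenario means. The scenario labels (`cruise`, `turn`), the helper `mean_by_scenario`, and all trajectories are hypothetical.

```python
import math
from collections import defaultdict

def ade(pred, truth):
    """Average displacement error: mean Euclidean distance over all timesteps."""
    return sum(math.dist(p, t) for p, t in zip(pred, truth)) / len(truth)

def fde(pred, truth):
    """Final displacement error: Euclidean distance at the last timestep."""
    return math.dist(pred[-1], truth[-1])

def mean_by_scenario(samples, metric):
    """Return (mean over all samples, dict of per-scenario means) for one metric."""
    per_scenario = defaultdict(list)
    for label, pred, truth in samples:
        per_scenario[label].append(metric(pred, truth))
    all_values = [v for values in per_scenario.values() for v in values]
    overall = sum(all_values) / len(all_values)
    return overall, {k: sum(v) / len(v) for k, v in per_scenario.items()}

# Toy data, invented for illustration: nine easy "cruise" samples and one hard "turn".
truth = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0)]
cruise_pred = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0)]  # matches the ground truth
turn_pred = [(0.0, 0.0), (1.0, 3.0), (2.0, 6.0)]    # drifts away laterally
samples = [("cruise", cruise_pred, truth)] * 9 + [("turn", turn_pred, truth)]

overall_ade, ade_by_scenario = mean_by_scenario(samples, ade)
print(overall_ade, ade_by_scenario)
```

In this toy set the overall ADE is dominated by the frequent cruising samples, while the per-scenario breakdown exposes the large error on the single turn, which is the kind of information a dataset-wide average hides.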
In this paper, we propose a new benChmarking paRadIgm for evaluaTing
trajEctoRy predIction Approaches (CRITERIA). Specifically, 1) we propose a
method for extracting driving scenarios at varying levels of specificity
according to the structure of the roads, models' performance, and data
properties for fine-grained ranking of prediction models; 2) we propose a set
of new bias-free metrics for measuring diversity, by incorporating the
characteristics of a given scenario, and admissibility, by considering the
structure of roads and kinematic compliancy, motivated by real-world driving
constraints; and 3) using the proposed benchmark, we conduct extensive
experimentation on a representative set of prediction models using the
large-scale Argoverse dataset. We show that the proposed benchmark can produce
a more accurate ranking of the models and serve as a means of characterizing
their behavior. We further present ablation studies to highlight the
contributions of the different elements used to compute the proposed metrics.
Stacked Cross-modal Feature Consolidation Attention Networks for Image Captioning
Recently, the attention-enriched encoder-decoder framework has aroused great
interest in image captioning due to its overwhelming progress. Many visual
attention models directly leverage meaningful regions to generate image
descriptions. However, seeking a direct transition from visual space to text is
not enough to generate fine-grained captions. This paper exploits a
feature-compounding approach to bring together high-level semantic concepts and
visual information regarding the contextual environment in a fully end-to-end manner. Thus,
we propose a stacked cross-modal feature consolidation (SCFC) attention network
for image captioning in which we simultaneously consolidate cross-modal
features through a novel compounding function in a multi-step reasoning
fashion. Besides, we jointly employ spatial information and context-aware
attributes (CAA) as the principal components in our proposed compounding
function, where our CAA provides a concise context-sensitive semantic
representation. To make better use of the consolidated features' potential, we
further propose an SCFC-LSTM as the caption generator, which can leverage
discriminative semantic information through the caption generation process. The
experimental results indicate that our proposed SCFC can outperform various
state-of-the-art image captioning benchmarks in terms of popular metrics on the
MSCOCO and Flickr30K datasets.
Standard SPECT myocardial perfusion estimation from half-time acquisitions using deep convolutional residual neural networks
The purpose of this work was to assess the feasibility of acquisition time reduction in MPI-SPECT imaging using deep learning techniques through two main approaches, namely reduction of the acquisition time per projection and reduction of the number of angular projections.
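
As a rough illustration of the two reduction strategies named above, the sketch below emulates each one on a toy sinogram of photon counts. It is an assumption-laden example and not the authors' pipeline: the function names, the use of binomial thinning to mimic half the time per projection, and the choice to keep every second angle are all hypothetical.

```python
import random

def halve_time_per_projection(sinogram, rng):
    """Emulate half the acquisition time per projection (assumed approach):
    keep each detected count independently with probability 0.5."""
    return [
        [sum(1 for _ in range(count) if rng.random() < 0.5) for count in row]
        for row in sinogram
    ]

def halve_angular_projections(sinogram):
    """Emulate half the number of angular projections (assumed approach):
    keep every second projection angle, i.e. every second row."""
    return sinogram[::2]

# Toy sinogram, invented for illustration: 4 projection angles x 2 detector bins.
sinogram = [[10, 20], [30, 40], [50, 60], [70, 80]]
rng = random.Random(0)  # fixed seed so the sketch is reproducible

thinned = halve_time_per_projection(sinogram, rng)
sparse = halve_angular_projections(sinogram)
print(thinned)
print(sparse)
```

Thinning Poisson-distributed counts with probability 0.5 yields Poisson counts with half the mean, which is why it is a common stand-in for a shorter acquisition. A network of the kind the abstract describes would then be trained to map such reduced data back to the standard acquisition.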